Abstract

Pandemics can cause immense disruption and damage to communities and societies. Thus
far, modeling of pandemics has focused on either large-scale difference equation models like the
SIR and the SEIR models, or detailed micro-level simulations, which are harder to apply at
a global scale. This paper introduces a hybrid model for pandemics considering both global
and local spread of infections. We hypothesize that the spread of an infectious disease between
regions is significantly influenced by global traffic patterns and the spread within a region is
influenced by local conditions. Thus we model the spread of pandemics considering the connections
between regions for the global spread of infection and population density based on the
SEIR model for the local spread of infection. We validate our hybrid model by carrying out a
simulation study for the spread of the SARS pandemic of 2002-2003 using available data on
population, population density, and traffic networks between different regions. While it is well-known
that international relationships and global traffic patterns significantly influence the spread of
pandemics, our results show that integrating these factors into relatively simple models can
greatly improve the results of modeling disease spread.
1 Introduction

Modeling of the spread of infectious disease typically falls into one of two categories. Analytically
tractable models like the SEIR model are capable of capturing some globally important phenomena
like the rate of spread of diseases using few parameters. However, they have a hard time reflecting
differences in global spread due to local conditions. For example, it can be difficult to model
different rates of spread in countries with different population densities and public health policies
of variable strength and coordination. Network- or agent-based models are capable of reflecting
details of individual conditions. However, modeling large-scale global disease spread using such
models often runs into methodological problems like overfitting because of the vast number of
possible parameters.

This paper proposes a granular, network-based hybrid model of disease spread in which individual
regions are modeled as nodes in the network, and the spread of disease within nodes is modeled
analytically (using a simplified derivative of the SEIR model) with the help of demographic
parameters like population density. The properties of the network as a whole, like connectivity, are
determined using real data on traffic between regions. We demonstrate the power of this approach
by simulating the spread of SARS. One of the key takeaways is that the level of granularity has
a significant effect on the success of network- or agent-based simulation models. For example,
we show that modeling China as an individual node is unsuccessful, whereas breaking it up into
constituent regions gives an impressive match to real infection data on SARS.

One of the great advantages of our model is its parsimony: it contains relatively few tweakable
parameters compared with general agent-based models. At the same time it is capable of reproducing
the important broad flows of disease. However, it is important to remember that exact
reproduction of historical data is not the end goal. Cases where the model does not correspond to
real data provide insight into specific local phenomena that influenced the progression of a
pandemic, such as the actual timing of the first infected case in a country.

1.1 Related Work 

There is a vast literature on understanding the spread of disease using analytical and simulation 
models. In the next section we give a brief overview of the most common modeling methodologies, 
including differential equation models and simulation models, but here we discuss related research 
more generally. The most closely related to this work can be grouped into two categories. First, 
several researchers have simulated and analyzed the local spread of SARS in 2002-2003 [20 ^ 123 ] 150] . 
In particular, Huang, et al reproduce the situations in Singapore, Taipei, and Toronto individually,
and compare with the actual transitions [20]. This also ties in to a significant existing literature
on local modeling of historical pandemics, like the influenza during the First World War (e.g.
[6] [26]). Other examples also abound: Jenvald, et al use a virtual city based on Linköping, Sweden,
considering the number of schools, age distribution, and household type [21]; Longini, et al model
population and contact processes based on Thailand census data, demographic information, and
social network data [25]; Kelso, et al model a real community in the south west of Western Australia
[22].

The second category involves simulating global infection spread using international traffic data. 
For example, several papers use air travel data to estimate connectivity in a network [HI |9l [T31 
115] . However, these authors typically simulate a hypothetical global pandemic, with a focus on 
intervention policies; the focus of our research is to validate the simulation with real historical 
data. 

Much existing research simulates infection in networks with reasonable properties, but not 
necessarily based on existing real-world data. For example, Bailey simulates epidemics in two 
dimensions, such as square grids [I]. Patel, et al [39] and Weycker, et al [33] consider hypothetical
populations of 10,000 persons, comprised of five communities of equal size, containing schools and 
neighborhoods. Vespignani, Pastor-Satorras, and co-authors simulate spreading infectious diseases 
with complex networks [21 [23 EH ESI EFJ EZ]. Carrat, et al [5], Glass, et al [18j, and Eubank [13] 
also use generated complex networks for simulation. 

Another major theme of research has been the effects of prevention and/or mitigation strategies.
These studies typically compare a "base" simulation and an alternative simulation which considers
some proposed strategy. For example, Longini, et al use stochastic epidemic simulations to
investigate the effectiveness of targeted antiviral prophylaxis to contain influenza [24]. Kelso, et al
simulate the effect of social isolation, such as school closure, individual isolation, workplace
nonattendance, and reduction of contact [22]. Carrat, et al explore the impact of interventions, such
as vaccination, treatment, quarantine, and closure of schools and workplaces [5]. Germann, et al
simulate and compare the baseline and several combinations of mitigation methods [17]. Patel, et
al use genetic algorithms to find optimal vaccination strategies [39]. Weycker, et al estimate the
population-wide benefits of routine vaccination of children [33].
Figure 1: SARS Map: Cumulative number of reported cases as of April 8, 2003 [17]



1.2 The SARS Pandemic 

The SARS pandemic of 2002-2003 is a useful case study for our modeling methodology. The pandemic
spread to 29 countries/regions in 2002 and 2003. In total 8,096 people were infected and 774 people
died as of December 31, 2003 [48]. Figure 1 shows the spread of SARS as of April 8, 2003 [37].
In 20 of 29 countries/regions, 100% of total cases in the country were "imported" (as defined by
WHO) from other countries [16].

The SARS pandemic is a particularly useful case study because we have high-fidelity data on
the outbreak. First, the beginning and end of the pandemic are clear. According to the WHO,
the first case was a male in his 40s in Guangdong, China, in November 2002. SARS started
substantially spreading from Hong Kong to other countries in February 2003, infecting 29 countries
and regions by July 2003. After that, there were no new cases except one infection through a
laboratory accident. Second, the number of cases is clearly reported (and relatively small). WHO
reported the cumulative number of cases and the number of new infected cases from March 17th
to July 11th 2003 [45]. Third, the number of infected countries is clearly shown. There are 29
countries/regions which were infected by SARS by the end of 2003 [48]. Thus we have good data on
the progress of infection in different countries and regions.

2 Modeling the Spread of Disease 

We first introduce the main methodologies for modeling the spread of infectious diseases before 
describing our approach in detail. 

2.1 Infectious Disease Models 

2.1.1 SIR Model 

The classic SIR model, proposed by Kermack and McKendrick in 1927 [3], posits three classes
of agents: Susceptible, Infectious, and Removed. Susceptible agents (hereafter denoted S) are
vulnerable to a disease and have the potential to be infected. Infectious agents (I) are currently
infected and have the risk of infecting S. Removed (R) agents are removed from the system: they
are either dead or have acquired immunity.

Thus R is not infected again. R is also called Recovered when we assume it is not dead. When
R is not dead but has instead acquired immunity, the total population, (S + I + R), is constant.
The model assumes that agents in the set S are sometimes infected through contact with an agent
in I, and that agents in I change to R at a constant rate. This yields the expressions below for the
transition of populations of these three classes.

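\[
\frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = \beta S I - \lambda I, \qquad \frac{dR}{dt} = \lambda I \tag{1}
\]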

where $\beta$ is the rate of infection from S to I and $\lambda$ is the rate of recovery from I to R. $\lambda$ is inversely
proportional to the average infectious period $\tau$: $\lambda = 1/\tau$. When we assume that the population is
constant in this case, the total population N is given by S + I + R. When $\beta/\lambda > 1$, the infection
spreads since the probability that S becomes I is greater than the speed at which I becomes R. The
basic reproduction number, $R_0$, is the average number of persons infected by a single infected
person when the population has no immunity and no control against the infection [39]. In the SIR
differential equation model, the basic reproduction number is given by $R_0 = N\beta/\lambda$. If one infected
person infects more than one susceptible person (i.e., $R_0 > 1$), secondary infection occurs and the
infection spreads. On the other hand, if $R_0 < 1$, the disease dies out in the system. Therefore
$R_0 = 1$ is a threshold for spread.
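The threshold behaviour can be checked numerically. The following is a minimal sketch, not part of the original study: it integrates the SIR system with a simple forward-Euler scheme, writing `beta` for the infection rate and `lam` for the recovery rate. The function names, the step size `dt`, and all numerical values in the usage note are illustrative assumptions.

```python
def basic_reproduction_number(n, beta, lam):
    """R0 = N * beta / lam, with N the (constant) total population."""
    return n * beta / lam


def simulate_sir(s0, i0, r0, beta, lam, dt=0.1, steps=1000):
    """Forward-Euler integration of
    dS/dt = -beta*S*I, dI/dt = beta*S*I - lam*I, dR/dt = lam*I.
    Returns a list of (S, I, R) tuples, one per time step."""
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt  # flow from S to I in this step
        new_removals = lam * i * dt         # flow from I to R in this step
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history
```

For instance, with a hypothetical population of 1000, `beta = 0.0005` and `lam = 0.1` give a basic reproduction number of 5 and a growing outbreak, whereas `beta = 0.00005` gives 0.5 and the number of infectious agents shrinks from the first step.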

2.1.2 SEIR Model 

The SEIR model is a derivative of the SIR model. The SIR model does not consider the incubation
period. Thus, when S is infected, it becomes I immediately and starts to infect other S [13]. In the
real world, there is some delay between the time that a person is infected and the time that he/she
starts infecting others. The SEIR model denotes agents in the incubation period as belonging to
class E (exposed) [19]. The corresponding transition equations are:
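A standard textbook form of the SEIR system is sketched below, using the same infection rate $\beta$ and recovery rate $\lambda$ as in the SIR model. The symbol $\sigma$ for the rate at which exposed agents become infectious (the inverse of the average incubation period) is an assumption introduced here for illustration and may differ from the notation originally used.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta S I \\
\frac{dE}{dt} &= \beta S I - \sigma E \\
\frac{dI}{dt} &= \sigma E - \lambda I \\
\frac{dR}{dt} &= \lambda I
\end{aligned}
```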
2.1.3 Network- and Agent-Based Models

Agent-based modeling provides an explicit, local method of understanding the spread of infection. It 
allows for fine-grained control over many aspects of the dynamic model of disease spread, including 
geographic factors and agent movements. For example, Carley, et al [1] simulate the spread of 
anthrax and Epstein [12] investigates the spread of smallpox with agent-based models. Deguchi,
et al have developed an Agent Based simulation language called SOARS, Spot Oriented Agent 
Role Simulator |11| [TU] for simulating the spread of disease considering modules such as human 
activities, opportunity for contact between people in a society, disease state, and intervention to 
control the spread. 

Network-based models typically represent agents as nodes on graphs and allow the connectivity 
structure of the graph to determine the possible spread of disease. For example, extending an SIR
model to networks would involve allowing a susceptible vertex S to be infected by an infectious
vertex I only if S is adjacent to I. Network-based models are useful in that they can reflect social
and economic networks. People's behaviors and social contacts build the network, and the infection
travels along its edges.
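The adjacency rule for a network SIR model can be sketched as follows. This is a generic discrete-time illustration, not code from any of the cited works; the per-contact infection probability `p_infect`, the per-step recovery probability `p_recover`, and the dictionary-based graph representation are assumptions made for the example.

```python
import random


def network_sir_step(adjacency, state, p_infect, p_recover, rng):
    """One synchronous update of a discrete-time SIR process on a graph.

    adjacency: dict mapping each vertex to an iterable of its neighbors
    state:     dict mapping each vertex to "S", "I", or "R"
    rng:       a random.Random instance (passed in for reproducibility)
    """
    new_state = dict(state)
    for vertex, status in state.items():
        if status == "S":
            # A susceptible vertex can only be infected by an adjacent infectious vertex.
            for neighbor in adjacency[vertex]:
                if state[neighbor] == "I" and rng.random() < p_infect:
                    new_state[vertex] = "I"
                    break
        elif status == "I":
            # Infectious vertices are removed with a fixed probability per step.
            if rng.random() < p_recover:
                new_state[vertex] = "R"
    return new_state
```

On a path graph 0-1-2 with only vertex 0 infectious, vertex 2 cannot be infected in the first step, because it is not adjacent to any infectious vertex.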

2.2 Our Approach 

Our model uses local regions and interconnections between them. There are three possibilities for
a new infection in a region: (1) infection from travelers from outside the region, (2) infection from
returning travelers, and (3) infection from local persons. We denote infection types (1) and (2) as
"global" infections and type (3) as "local" infections. Figure 2 shows the basic structure of our
model.



Global and Local Infections We assume that infection starts in a particular country or region 
and spreads from there. At each cycle, infections of all types can occur. Global infections (types 1 
and 2) occur with frequencies that are dependent on the level of travel between regions, and local 
infections are mostly dependent on the population density of a region (details of the data used 
are below). Our local model is based on the concept behind the SEIR model. We consider the 
same four types of agents in each region: Susceptible, Exposed, Infectious, and Removed. When an 
infection occurs, agents are considered exposed. The model proceeds in time cycles t.

Figure 2: Structure of the simulation model

The number of agents newly exposed in region i at time t through the global infection mechanism
is modeled as

\[ EG_i(t) = \sum_{j} I_j(t) \cdot T_{ij} \cdot P_G(t), \]

where the sum is over other regions, $T_{ij}$ is the sum of the number of travelers
from region i to region j and the number of travelers from region j to region i (since infection can
occur through both arriving and returning travelers), and $P_G(t)$ is a "global infection coefficient"
at time t, described below.

Local infection follows a similar process, so that the number of agents newly exposed through
the local infection mechanism at any time t is given by

\[ EL_i(t) = S_i(t) \cdot I_i(t) \cdot P_{L_i}(t), \]

where $P_{L_i}(t)$ is a "local infection coefficient" (similar to the global infection coefficient;
both are described in detail below).
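The two exposure mechanisms can be written compactly in code. The sketch below is a hedged illustration: global exposure in a region is the sum, over the other regions, of their infectious agents times the two-way traveler count times the global coefficient, and local exposure is the product of the region's susceptible agents, infectious agents, and local coefficient. The function names, the list-of-lists travel matrix, and the toy numbers in the usage note are assumptions, not values from the study.

```python
def global_exposures(region, infectious, travel, p_global):
    """Newly exposed agents in `region` through the global mechanism.

    infectious: list with the number of infectious agents per region
    travel:     symmetric matrix; travel[i][j] is the two-way traveler
                count between regions i and j
    p_global:   global infection coefficient at the current time cycle
    """
    return sum(
        infectious[j] * travel[region][j] * p_global
        for j in range(len(infectious))
        if j != region  # the sum runs over the other regions only
    )


def local_exposures(susceptible, infectious, p_local):
    """Newly exposed agents in one region through the local mechanism."""
    return susceptible * infectious * p_local
```

With three hypothetical regions holding 10, 0, and 5 infectious agents, traveler counts of 100 between regions 0 and 1 and 50 between regions 1 and 2, and a coefficient of 0.001, region 1 receives 10 * 100 * 0.001 + 5 * 50 * 0.001 = 1.25 newly exposed agents per cycle.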

It is assumed that agents go from exposed to infectious according to some incubation period 
that is disease-specific, and, similarly, from infectious to removed according to some disease-specific 
recovery period. For the purposes of this paper, we set these to 10 for both incubation period and 
infectious period, but these parameters can of course be varied for modeling other diseases. 
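One possible reading of these fixed periods is a deterministic delay line, where each cohort of newly exposed agents spends exactly the incubation period in the exposed class and then exactly the infectious period in the infectious class. The sketch below illustrates that reading only; the class name `FixedDelayRegion` and the cohort-based bookkeeping are assumptions, and the original simulation may implement the transitions differently.

```python
from collections import deque


class FixedDelayRegion:
    """Tracks S, E, I, R counts for one region with fixed stage durations."""

    def __init__(self, susceptible, incubation_period=10, infectious_period=10):
        self.susceptible = float(susceptible)
        # One slot per time cycle spent in the stage; index 0 is the newest cohort.
        self.exposed = deque([0.0] * incubation_period)
        self.infectious = deque([0.0] * infectious_period)
        self.removed = 0.0

    def step(self, newly_exposed):
        """Advance one time cycle, admitting `newly_exposed` agents into E."""
        newly_exposed = min(float(newly_exposed), self.susceptible)
        self.susceptible -= newly_exposed
        # Oldest infectious cohort is removed; oldest exposed cohort becomes infectious.
        self.removed += self.infectious.pop()
        self.infectious.appendleft(self.exposed.pop())
        self.exposed.appendleft(newly_exposed)

    def total_infectious(self):
        return sum(self.infectious)
```

Under this bookkeeping, a cohort exposed in one cycle becomes infectious ten cycles later and is removed ten cycles after that.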



Infection probabilities As awareness of a disease spreads, it is likely that heightened awareness
and prevention measures start to reduce the spread of infection. We model this in our global and
local infection coefficients by introducing a term that dampens the coefficient over time. For global
infection, we use $P_G(t) = P_G - (D_G \cdot t)$, where $P_G$ is a basic global infection coefficient, held constant
across regions, and $D_G$ models the dampening effect.

We use a similar equation for the local infection coefficient, $P_{L_i}(t) = P_{L_i} - (D_L \cdot t)$. In this case,
$D_L$ is assumed constant across regions, but the basic local infection coefficient $P_{L_i}$ is region-specific,
given by $P_{L_i} = \rho_i \cdot C_1 + C_2$, where $\rho_i$ is the population density of region i, assumed to be the primary
driver of high local infection rates.

It is worth noting that the original SEIR model gives a similar type of equation for newly exposed
agents, $E = \beta \cdot S \cdot I$, where $\beta$ is the infection rate. The main novelty here is the combination of
modeling a declining infection rate and treating each region separately.
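The coefficient definitions translate directly into two small functions, sketched here under stated assumptions. Clamping the dampened coefficient at zero is an assumption added for the example (the text does not say what happens once the linear decay would make a coefficient negative), and the numbers in the usage note are illustrative rather than calibrated values.

```python
def local_base_coefficient(density, c1, c2):
    """Region-specific basic local coefficient: density * C1 + C2."""
    return density * c1 + c2


def dampened_coefficient(base, damping, t):
    """Coefficient at time cycle t: base - damping * t.

    Assumption: the value is floored at 0 so that it never becomes
    negative; the source text does not specify this behaviour.
    """
    return max(0.0, base - damping * t)
```

For example, a base coefficient of 2.0e-7 with a damping term of 5.0e-9 falls to 1.5e-7 at cycle 10 and, under the clamping assumption, stays at zero from cycle 40 onward.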



3 Calibration With Data 

There are several model parameters that need to be calibrated using real data. It is useful to 
consider some background information on the characteristics of SARS in this context. 

Characteristics of SARS The SARS Coronavirus causes general infection with viremia,
especially severe pneumonia and intestinal infection. It is transmitted primarily through droplet
infection. Due to its resistance to dryness, it can also be transmitted through air. It is thought
that the incubation period of SARS is usually 2-10 days and the average is 5 days [32]. In the
pandemic of 2002-2003, most countries reported a median incubation period of 4-5 days, and a
mean of 4-6 days. In the incubation period, it is unlikely an infected person will spread the disease
through droplet infection. The infectious period is thought to be about two weeks, with its peak
from the 7th-10th day after infection [32]. Transmission efficiency appears to be greatest from
severely ill patients who are experiencing rapid clinical deterioration, usually during the second
week of illness. Maximum virus excretion from the respiratory tract occurs on about day 10 of
illness and then declines to 0% by day 23. There are no reports of transmission beyond 10 days of
fever resolution [12]. The death rate varies by age group (SARS affects older patients much more
severely), but the overall death rate was about 9.6% in the 2002-2003 SARS pandemic, significantly
higher than that of seasonal influenza. Another notable feature of SARS is that "super-spreading"
events, where a person infects many more people than the average rate of infection would suggest,
are believed to be a key component in its transmission. Our model does not deal explicitly with
such levels of granularity, which may lead to some outlier predictions in areas where the law of
large numbers does not take over. This is discussed further in Section 5.

3.1 Correlation between Pandemic and Traffic 

It is thought that the origin of SARS was Guangdong in China, from where it quickly spread to Hong Kong.
Thus we consider countries/regions which have strong relationships with China and Hong Kong.
We first examine the numbers of travelers from China and Hong Kong and consider the ten
countries/regions where the number of travelers to and from China and Hong Kong is the largest
(see Table 1), yielding a total of 17 countries/regions. 16 of these 17 countries/regions were infected by
SARS. Since there were 29 countries/regions in total with reported cases of SARS, more than half of them
are represented in this table. Besides these 16 countries/regions, there are 13 other countries
infected by SARS: Canada, France, India, Indonesia, Italy, Kuwait, New Zealand, Ireland, Romania,
South Africa, Spain, Sweden, Switzerland. We focus on these 30 countries/regions in our experiments.
There are 8 countries/regions which had local spread: China, Hong Kong, Taiwan, Canada,
Singapore, Vietnam, Philippines, and Mongolia. 7 of these 8 are included in Table 1.

3.2 Correlation between Local Infection and Population Density 

We hypothesize that population density of an area is positively correlated with the local infection 
rate, because higher population densities lead to more frequent contact. We test this hypothesis 
using data from Chinese provinces, Hong Kong, and Taiwan, the most significant infected regions. 
Figure 3 shows how the number of infections in different Chinese provinces varied greatly at the peak
of the infection (from [31]), which makes it necessary to treat the individual regions separately. Since
97% of infections occur in 6 provinces, we use data from these 6. They are Guangdong Province
(the initial infected province), Beijing Municipality, Shanxi Province, Inner Mongolia Autonomous
Region, Hebei Province, and Tianjin Municipality. Table 2 shows basic data on population and
density for each of the provinces, Hong Kong, and Taiwan. Using these 6 provinces, Hong Kong,
and Taiwan, we can reject the null hypothesis that there is no correlation between population
density and infection rate at the 0.01-level. We choose the values for $C_1$ and $C_2$ by trial and error.

Table 1: Top 10 countries/regions in terms of number of travelers from/to China and Hong Kong in
2003 (dark-gray: country with local infection, light-gray: country with only imported cases, white:
country without local infection or imported cases. Created based on [331 IM])

Figure 3: Map of SARS cases by province in China as of May 18, 2003 [31]



3.3 Passenger Traffic 

Our initial simulations are focused on the 6 regions of China, Taiwan, and Hong Kong. Table 3
shows the number of travelers among the three countries/regions. However, it is difficult to estimate travel
between the regions of China, or to allocate travelers from China to the other countries amongst
the regions of China.

In order to approximate this travel information, we use data on passenger land traffic and civil
aviation in China in 2007 [30]. Table 4 shows the passenger traffic for each mode and the total [30]. We use
land traffic data for the 6 provinces we are interested in, as shown in Table 5 [30]. We compute the
share of each region in the total, where the total share is 100.
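The share computation is a simple normalization, sketched below with the regional land passenger counts reported for 2007 (in units of 10,000 persons). The function name `shares_percent` is an assumption made for this illustration.

```python
def shares_percent(passengers):
    """Return each region's share of the total, scaled so shares sum to 100."""
    total = sum(passengers.values())
    return {region: 100.0 * count / total for region, count in passengers.items()}


# Land passenger traffic by region, 2007, in units of 10,000 persons.
land_passengers = {
    "Beijing": 16190,
    "Tianjin": 6829,
    "Hebei": 88886,
    "Shanxi": 43866,
    "Inner Mongolia": 38678,
    "Guangdong": 199162,
    "Others": 1815574,
}

land_shares = shares_percent(land_passengers)
```

The regional counts sum to the national land total of 2,209,185, so Beijing's share, for example, is 16,190 / 2,209,185, or roughly 0.73%.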

Then, based on Tables 4 and 5, we approximate the number of travelers between two regions
by assuming that the share of a region is directly proportional to the number of travelers to the 
region. Also we assume that the share of passenger traffic by air is proportional to the share of 
passenger traffic by land. 

Table 2: Population, area, and population density in 6 provinces in mainland China, Hong Kong,
and Taiwan [25] [30] [31]

                  Population    Area (sq. km)    Population Density (per sq. km)
Beijing           17,422,637    16,801           1,037
Guangdong         83,079,300    177,900          467
Hebei             68,135,100    187,700          363
Hong Kong         6,708,940     1,108            6,055
Inner Mongolia    23,660,000    1,183,000        20
Shanxi            33,398,400    156,800          213
Taiwan            23,067,604    36,006           611
Tianjin           11,760,000    11,760           1,000

Table 3: Number of travelers between the three regions in 2003

              China         Hong Kong    Taiwan
China         -             5,692,500    NA*
Hong Kong     58,770,063    -            287,312
Taiwan        2,731,897     407,100      -

Row: Origin, Column: Destination
*Civil travel from China to Taiwan was not permitted in 2003 (Lifted on July 18, 2008)

Table 4: Total passenger traffic in China in 2007 [30]

                               Railways    Highways     Waterways    Civil Aviation    Total        (Total of Land)
Passengers (10,000 persons)    135,670     2,050,680    22,835       18,576            2,227,761    (2,209,185)
Share in Total (%)             6.09        92.05        1.03         0.83              100          (99.17)

Table 5: Passenger traffic by region in China in 2007 (civil aviation traffic is not included)

                  Passengers (10,000 persons)    Share in Total (%)
Beijing           16,190                         0.732849005
Tianjin           6,829                          0.309118965
Hebei             88,886                         4.023496235
Shanxi            43,866                         1.985620174
Inner Mongolia    38,678                         1.750778371
Guangdong         199,162                        9.015179308
Others            1,815,574                      82.18295794
National Total    2,209,185                      100

Table 6: Passengers passing through the main airport in 6 regions of China in 2007 [7]

                  Passengers     Share in Total (%)
Beijing           55,938,136     13.7859
Tianjin           4,637,299      1.1429
Hebei             1,043,688      0.2572
Shanxi            4,312,910      1.0629
Inner Mongolia    2,121,905      0.5229
Guangdong*        54,835,981     13.5143
Others            282,872,185    69.7138
National Total    405,762,104    100

*Including both Guangzhou airport and Shenzhen airport

We estimate travel between the different regions of China and Hong Kong and Taiwan by using
the share of the airport of each region in China in the national total. Table 6 shows the number
of passengers using the main airport in 6 regions of China and the share in the national total [7].
We apportion the number of travelers between China and Hong Kong or Taiwan according to the
share. For example, the share of Beijing airport in the national total is 13.7859%. The number of
travelers from China to Hong Kong is 5,692,500. The number of travelers from Hong Kong to China
is 58,770,063. Thus the total number of travelers between China and Hong Kong is 64,462,563.
The number of travelers between Beijing and Hong Kong is obtained as 0.137859 x 64,462,563 ≈
8,886,700.
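The Beijing to Hong Kong calculation can be reproduced in a few lines. This is a minimal sketch of the apportioning rule as described in the text; the function name `apportion_travelers` and its argument names are assumptions. Using the unrounded airport share (55,938,136 / 405,762,104) instead of the rounded 13.7859% gives a value that differs slightly from the rounded figure quoted in the text.

```python
def apportion_travelers(region_airport_passengers, national_airport_passengers,
                        travelers_outbound, travelers_inbound):
    """Allocate two-way traveler counts to one region by its airport share."""
    share = region_airport_passengers / national_airport_passengers
    return share * (travelers_outbound + travelers_inbound)


# Two-way travel between China and Hong Kong, apportioned to two regions.
beijing_hong_kong = apportion_travelers(55_938_136, 405_762_104, 5_692_500, 58_770_063)
tianjin_hong_kong = apportion_travelers(4_637_299, 405_762_104, 5_692_500, 58_770_063)
```

The same rule applies to any other region or partner country by substituting the corresponding airport passenger count and two-way traveler totals.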
Table 7: Parameter values in simulation

Parameter            Value
P_G                  2.0 x 10^-7
P_Li                 Depends on Density_i, C_1, and C_2
D_G                  5.0 x 10^-9
D_L                  2.5 x 10^-7
C_1                  7.23 x 10^-9
C_2                  7.69 x 10"°
Incubation_Period    10
Infectious_Period    10
Run_Cycle            100
Density_i            See Table 2
Population_i         See Table 2
T_ij                 See Table 8



4 Results 

4.1 Results for China, Hong Kong, and Taiwan 

For the preliminary experiment, we simulate with 6 regions in the Chinese mainland, Hong Kong,
and Taiwan. The number of susceptible agents in each region/country is initially equal to the
population of each region/country. Table 7 shows the summary of parameter values used in simulation.
These simulation parameters were chosen to provide a good fit to data from this initial simulation,
but we discuss below several inferences that can be made because many of the parameters are
constant, exploiting the granularity of the model. Then, in the second part of this section, we use
the same parameters to extend the model to 30 countries/regions, which provides a test for the
parameters, allowing us to evaluate the benefits and drawbacks in a validation setting.

Figure 4 shows the transition of the number of infected cases and the cumulative number of cases,
comparing real data and the results of our model. For the model we show data from
time cycles 45 through 75. The results of China's 6 regions are summed up and the total is shown
for China.

The figure shows that our model captures both the dynamics of the spread of SARS, as well as 
the total numbers, very well. The peaks come in order: Hong Kong, China, and Taiwan. The model 
achieves this without using any special parameters that vary in different countries. Populations, 
densities, and travel data are all taken from the real world. The SARS epidemic started spreading 
from Hong Kong and immediately reached mainland China. The peak of Hong Kong comes earlier 
than that of China since the population density is higher. However, the curve decreases from some 
point because the percentage of susceptible agents in the population decreases and the percentage 
of Recovered agents increases. Then the number of Infectious agents decreases. After that, the 
number of infected agents in China increases. Because of its population, the number of infected 
agents at its peak in China is the largest among the three countries/regions. The peak in Taiwan 
is slightly delayed because of the time lag in the infection reaching Taiwan. 

Figure 4: Three country model for dynamics of the spread of SARS (left) and the cumulative
number of cases (right), comparing reality and model predictions. Panels: (a) Real, (b) Simulation.

Region-wise Breakdown Figure 5 shows the predicted (from the model) and actual number of
cases for each of the eight modeled regions. While the fit is good for several of the most important
regions (in terms of number of cases), and therefore the overall numbers are good, there are
discrepancies for some of the regions that had relatively few cases. Specifically, the
model underpredicts the number of cases for some of the less densely populated provinces of China
(Shanxi, Inner Mongolia, and Hebei) and overpredicts for one of the more densely populated regions
(Tianjin). There are idiosyncratic events associated with the spread of any pandemic, so it is not
entirely surprising that some of the results do not match perfectly. The next section considers
anomalies in more detail, where some data is available. But it is important to note that the level
of granularity in modeling is very important. Figure 6 shows the difference in the model in two
cases: one where the six infected provinces in China are modeled independently, and one where
the six provinces are aggregated into one, using aggregated data on population density, travel, etc.
The figure clearly shows that the more granular model is a much better fit to the data.

4.2 Modeling 30 Countries/Regions 

As mentioned above, we use the parameters from the 8 region/country simulation to extend the
model to 30 total countries (35 regions/countries, since we continue to divide China into 6 regions).
Again, we use real population, density, and international travel data from the 27 new countries (for
Canada and Vietnam we use only Toronto and Hanoi, since only these regions had local spread
cases [16]). Table 8 shows the expected number of travelers between countries/regions. We again
apportion the number of travelers between each region in mainland China and other countries based
on the share of each region.
Figure 5: Total cases predicted in simulation and in reality for the eight modeled regions

Figure 6: Model predictions for total infection in China, Hong Kong and Taiwan when splitting
the 6 infected provinces versus aggregating them into one for modeling
Table 8: Expected number of travelers between countries/regions, 2004 (from

                      Origin
Destination           Beijing      Tianjin      Hebei        Shanxi       Inner Mongolia  Guangdong    Hong Kong
Beijing               -            1,009,387    13,140,321   6,484,244    5,717,286       29,449,654   8,886,773
Tianjin               1,009,387    -            5,542,547    2,735,034    2,411,533       12,421,773   736,718
Hebei                 13,140,321   5,542,547    -            35,604,996   31,393,629      161,708,110  165,808
Shanxi                6,484,244    2,735,034    35,604,996   -            15,491,550      79,796,745   685,183
Inner Mongolia        5,717,286    2,411,533    31,393,629   15,491,550   -               70,358,368   337,103
Guangdong             29,449,654   12,421,773   161,708,110  79,796,745   70,358,368      -            8,711,676
Hong Kong             8,886,773    736,718      165,808      685,183      337,103         8,711,676    -
Taiwan                376,618      31,222       7,027        29,038       14,286          369,197      694,412
Australia             58,114       4,818        1,084        4,481        2,204           56,969       326,192
Canada                42,295       3,506        789          3,261        1,604           41,462       233,432
France                127,391      10,561       2,377        9,822        4,832           124,881      64,800
Germany               67,562       5,601        1,261        5,209        2,563           66,231       88,100
India                 33,121       2,746        618          2,554        1,256           32,468       114,770
Indonesia             37,595       3,117        701          2,899        1,426           36,855       205,328
Ireland, Republic of  1,746        145          33           135          66              1,712        -
Italy                 26,404       2,189        493          2,036        1,002           25,884       571,866
Japan                 372,714      30,898       6,954        28,737       14,138          365,371      823,514
Korea, Republic of    338,958      28,100       6,324        26,134       12,858          332,279      381,573
Kuwait                548          45           10           42           21              537          14
Macao                 2,783,184    230,727      51,928       214,587      105,575         2,728,346    1,374,748
Malaysia              107,632      8,923        2,008        8,299        4,083           105,511      220,027
Mongolia              70,114       5,813        1,308        5,406        2,660           68,733       380
New Zealand           15,084       1,250        281          1,163        572             14,787       61,247
Philippines           67,519       5,597        1,260        5,206        2,561           66,188       318,453
Romania               2,345        194          44           181          89              2,299        -
Russian Federation    284,026      23,546       5,299        21,899       10,774          278,430      3,585
Singapore             130,496      10,818       2,435        10,061       4,950           127,924      410,460
South Africa          8,472        702          158          653          321             8,305        18,600
Spain                 4,089        339          76           315          155             4,009        21,500
Sweden                6,899        572          129          532          202             6,763        -
Switzerland           11,890       986          222          917          451             11,655       45,642
Thailand              124,024      10,282       2,314        9,562        4,705           121,581      790,020
United Kingdom        49,123       4,072        917          3,787        1,863           48,155       366,100
United States         135,080      11,198       2,520        10,415       5,124           132,418      646,612
Vietnam               113,573      9,415        2,119        8,757        4,308           111,336      3,264

Figure 7: Infection route of SARS in simulation 

Figure [7] shows the infection route in our model. Most countries are infected from Hong Kong 
or Guangdong. Some countries are infected from other regions. For example, Vietnam, Mongolia, 
and Russia are infected from Beijing. 

Figure [8] compares the cumulative number of cases in the simulation with real data. The numbers 
correspond well, especially for the most heavily affected countries. In the real world, 8 
countries/regions experienced local spread; in the model, 18 countries/regions develop local spread. 
Four countries/regions are true statistical outliers in terms of the number of cases predicted by 
the model versus the number of cases experienced in reality: Singapore, Macao, Canada, and Japan. 



Figure 8: Comparison of number of cumulative cases in 30 countries/regions ((a) Top 6 
countries/regions, (b) 24 other countries/regions). Note the different Y axes. 

Discussion of anomalies We hypothesize that the outliers in this case are related to the nature 
of the spread of SARS. An early chance outbreak in a country or region can lead to significantly 
more cases than expected. Similarly, if a country manages to avoid a case of SARS for longer than 
predicted by international travel data, heightened awareness and prevention strategies will lead to 
many fewer cases than expected. This factor may be especially important for SARS, because there 
is considerable evidence that some people infected with SARS are "super spreaders" who may 
affect the trajectory of the spread. While an infected person infects, on average, 1-3 
people [32], some infected people pass the virus to many other people [TB]. Although it is not clear 
what causes someone to become a super spreader, it is suspected that a person who has a chronic 
illness such as diabetes is more likely to be a super spreader [32]. The origin of SARS is a case in 
point. A physician became ill on February 15th, 2003 after caring for patients who had developed a 
strange new form of pneumonia in Guangdong. He stayed at the Metropole Hotel in Hong Kong on 
February 21st. On March 4th, he died of what would later be called SARS. During his one-night 
stay at the Metropole Hotel, the SARS virus passed to at least 15 other guests at the hotel. 
The virus then spread around the world, leading to outbreaks in other countries [16]. 

In each of the outlier cases, where the model makes a significantly different prediction than 
the actual trajectory of the pandemic, it turns out that the first reported case happened at a 
different time than would statistically be predicted by travel flows. While Macao, Japan, and the 
Republic of Korea have large numbers of travelers from/to China and Hong Kong, these countries 
experienced much less infection than predicted by the model. Each of them experienced its first 
infection at a much later date than predicted, as shown in Table [9]. The Republic of Korea 
imported its first case on April 25th, Macao imported its first case on May 5th, 2003, and Japan 
was never infected. The Republic of Korea and Macao thus imported their first cases one or more 
months after Vietnam, Canada, Taiwan, Singapore and the Philippines. Meanwhile, Canada, despite being 
less strongly linked by travel to China and Hong Kong, was infected on February 23rd, early in the 
pandemic (in fact, from the original super-spread event at the Metropole Hotel). 
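
The lag between the early and the late importers can be checked directly from the onset dates 
listed in Table 9. The following is a small sketch, assuming that every date given without an 
explicit year falls in 2003; the dictionary and helper names are illustrative and are not part of 
the original study. 

```python
from datetime import date

# Dates of the first probable case, copied from Table 9.
# Assumption: entries listed without a year are in 2003.
onset = {
    "Viet Nam": date(2003, 2, 23),
    "Canada": date(2003, 2, 23),
    "Taiwan": date(2003, 2, 25),
    "Singapore": date(2003, 2, 25),
    "Philippines": date(2003, 2, 25),
    "Korea, Republic of": date(2003, 4, 25),
    "Macao": date(2003, 5, 5),
}

early_group = ["Viet Nam", "Canada", "Taiwan", "Singapore", "Philippines"]
late_group = ["Korea, Republic of", "Macao"]


def lag_days(later, earlier):
    """Number of days between the onset dates of two countries/regions."""
    return (onset[later] - onset[earlier]).days


# Measure each late importer against the latest onset in the early group,
# which gives the smallest (most conservative) lag.
latest_early = max(early_group, key=lambda c: onset[c])
lags = {c: lag_days(c, latest_early) for c in late_group}

for country, lag in lags.items():
    print(f"{country}: {lag} days after {latest_early}")
```

Even with this conservative reference point, both lags come out at about two months, consistent 
with the "one or more months" stated above. 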

To lend more weight to this hypothesis, we ran the model again, this time using the actual 
time of first infection in each country rather than the time implied by travel data. All other 
parameters of the simulation remained the same. Figure [9] shows that the cumulative number of 
cases from the model then corresponds better to real data. 
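
The effect of the time of first infection can be illustrated with a minimal single-region sketch. 
It is not the calibrated model used in this paper: it is a generic discrete-time SEIR update in 
which the first case is forced on a chosen day. The function name, the parameter values (beta, 
sigma, gamma), the population, and the onset days are all illustrative assumptions; beta/gamma is 
set to 2 only so that it lies inside the range of 1-3 secondary infections quoted earlier. 

```python
def simulate_seir(population, beta, sigma, gamma, days, onset_day, seed=1.0):
    """Discrete-time SEIR for one region, seeded on a fixed onset day.

    Returns the cumulative number of cases (everyone who has ever become
    infectious) at the end of each simulated day.
    """
    s, e, i, r = float(population), 0.0, 0.0, 0.0
    total_cases = 0.0
    cumulative = []
    for day in range(days):
        if day == onset_day:
            # Force the first imported case on the given onset day.
            s -= seed
            i += seed
            total_cases += seed
        new_exposed = beta * s * i / population  # S -> E
        new_infectious = sigma * e               # E -> I
        new_removed = gamma * i                  # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        total_cases += new_infectious
        cumulative.append(total_cases)
    return cumulative


# Hypothetical values, for illustration only.
HORIZON = 120
params = dict(population=1_000_000, beta=0.5, sigma=0.2, gamma=0.25, days=HORIZON)

early = simulate_seir(onset_day=10, **params)
late = simulate_seir(onset_day=70, **params)

print(f"cumulative cases, early onset: {early[-1]:.1f}")
print(f"cumulative cases, late onset:  {late[-1]:.1f}")
```

Because the two runs differ only in the onset day, the region seeded later has had less time to 
accumulate cases by the end of the fixed horizon. This is the timing effect discussed above in 
isolation; it ignores the heightened awareness and prevention measures that a late onset may also 
bring. 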

Local Considerations Our model trades off adaptability to local conditions for a smaller number 
of parameters to fit. This can have several effects. Here we discuss two of them, and how they 
might affect the results. First, if we look at data from seasonal flu cases, we find that Canada 
typically has a large number of cases, and the United States has the largest number of influenza 
isolates [16]. Both of these suggest that the local infection coefficient may be higher in Canada and 
the United States than in other countries. Indeed, this could have been an additional factor in the 
surprisingly large number of Canadian cases. The United States, however, was surprising: 
although it imported 27 cases, the infection did not spread locally. This may indicate that the 
quarantine and isolation measures employed there worked effectively. 

A second interesting point is that Singapore and Vietnam both report many fewer cases than 
predicted by the model. This may be partly explained by their lower propensity to spread infection, 
again as evidenced by seasonal flu data. There may also have been a significantly bigger push to 
hospitalize and keep patients confined, weakly evidenced by the fact that the proportion of those 
infected who were healthcare workers in these two countries (41% and 57% in Singapore and 
Vietnam, respectively) was much higher than in other countries (21%). 

Table 9: Infected countries/regions and the date of onset (dark-gray: country with local infection, 
light-gray: country with only imported cases) [16] 

Country | Date of Onset: First Probable Case | Imported Cases | Total Cases | Percentage of Imported Cases
China | 16-Nov 2002 | NA | 5,327 | NA
Hong Kong | 15-Feb 2003 | NA | 1,755 | NA
Viet Nam | 23-Feb | 1 | 63 | 2
Canada | 23-Feb | 5 | 251 | 2
United States | 24-Feb | 27 | 27 | 100
Taiwan | 25-Feb | 21 | 346 | 6
Singapore | 25-Feb | 8 | 238 | 3
Philippines | 25-Feb | 7 | 11 | 50
Australia | 26-Feb | 6 | 6 | 100
Ireland, Republic of | 27-Feb | 1 | 1 | 100
United Kingdom | 1-Mar | 4 | 4 | 100
Germany | 9-Mar | 9 | 9 | 100
Switzerland | 9-Mar | 1 | 1 | 100
Thailand | 11-Mar | 9 | 9 | 100
Italy | 12-Mar | 4 | 4 | 100
Malaysia | 14-Mar | 5 | 5 | 100
Romania | 19-Mar | 1 | 1 | 100
France | 21-Mar | 7 | 7 | 100
Spain | 26-Mar | 1 | 1 | 100
Sweden | 28-Mar | 5 | 5 | 100
Mongolia | 31-Mar | 8 | 9 | 89
South Africa | 3-Apr | 1 | 1 | 100
Indonesia | 6-Apr | 2 | 2 | 100
Kuwait | 9-Apr | 1 | 1 | 100
New Zealand | 20-Apr | 1 | 1 | 100
Korea, Republic of | 25-Apr | 3 | 3 | 100
India | 25-Apr | 3 | 3 | 100
Macao | 5-May | 1 | 1 | 100
Russian Federation | 5-May | 1 | 1 | 100

Figure 9: Comparison of number of cumulative cases in 30 countries/regions considering actual 
time of first infection ((a) Top 6 countries/regions, (b) 24 other countries/regions) 

5 Discussion 

We have discussed a hybrid network and local model for the spread of pandemics, and applied 
it to the case of SARS. When parameters are calibrated to real data on populations, densities, 
and traffic, we show that the model reproduces many of the key dynamics of the spread of SARS 
in 2002 and 2003, while remaining parsimonious, and therefore useful for understanding the root 
causes of why pandemics spread in the way they do. Both the successes and the failures of the 
simple model provide insights into pandemic spread. For example, it is clear that it is important 
to model international traffic to understand the pathways of spread. At the same time, for any 
particular pandemic, individual idiosyncrasies can come into play. For instance, the importance of 
super-spreaders in SARS is reflected in the fact that the time of first infection in a country has a 
big role in how many people get infected. The other major takeaway from this work is that the 
level of granularity in the network structure of the model has a significant impact. For example, 
treating China as one large entity leads to poorer prediction, but at the same time specializing all 
the way down to cities would end up requiring too much data to accurately calibrate the model, 
and would probably not provide significantly better prediction. 